Abstract: The TLRs and IL-1 receptors have evolved to coordinate the innate immune
response following pathogen invasion. Receptors and signalling intermediates of these
systems are generally characterised by a high level of evolutionary conservation. The
recently described IL-1RI co-receptor TILRR is a transcriptional variant of the FREM1
gene. Here we investigate whether innate co-receptor differences between teleosts and
mammals extend to the expression of the TILRR isoform of FREM1. Bioinformatic and
phylogenetic approaches were used to analyse the genome sequences of FREM1 from
eukaryotic organisms including 37 tetrapods and five teleost fish. The TILRR consensus
peptide sequence was present in the FREM1 gene of the tetrapods, but not in fish orthologs
of FREM1, and neither FREM1 nor TILRR was present in invertebrates. The TILRR gene
appears to have arisen via incorporation of adjacent non-coding DNA with a contiguous
exonic sequence after the teleost divergence. Comparison with co-receptors in other systems
points to their origin during the same stages of evolution. Our results show that modern
teleost fish do not possess the IL-1RI co-receptor TILRR, but that it is maintained in
tetrapods as early as amphibians. Further, they are consistent with data showing that
co-receptors are recent additions to these regulatory systems and suggest this may underlie
differences in innate immune responses between mammals and fish.
regulator; NF-κB: nuclear factor kappa B; NCBI: National Center for Biotechnology Information;
FREM-1: FRAS Related Extracellular Matrix gene 1.

1. Introduction 

The innate immune system is generally well conserved throughout the animal kingdom, with the
same characteristic features of regulatory components present in species ranging from insects to
mammals [1]. Activation is induced through members of the Toll-like and IL-1 receptor (TIR) family,
characterized by the cytoplasmic TIR domain [2]. The high level of conservation of the intracellular
domains of both Toll and IL-1RI and of cytoplasmic regulatory components is consistent with a
signaling system that is broadly conserved throughout evolution, predating even the divergence of
plants [3] and animals [4]. However, it is increasingly recognized that mechanisms of ligand
recognition and co-receptor association, a potent regulator of signal amplification at the level of the
receptor complex, are less well conserved [5,6]. In evolutionary terms, such co-receptors appeared
relatively late in the development of the signaling networks they control.

Recent studies have revealed that fish, which have been shown to possess certain inflammatory
receptors, may lack co-receptors found in more modern organisms, suggesting that signaling
mechanisms in earlier species are functionally distinct and less refined. For example, the
zebrafish (Danio rerio) possesses two paralogs of TLR4, neither of which is stimulated by LPS, and
lacks the co-receptors MD2 and CD14 [7,8]. Similarly, phylogenetic studies of the synteny of the
syndecan genes in fish and tetrapods have revealed that while the four mammalian syndecan genes arose
through gene duplication, Syndecan 1 (an FGFR co-receptor) is absent from fish genomes, probably as a
result of deletion following this duplication event [9].

We recently identified the IL-1R co-receptor, TILRR (Toll-like/IL-1 receptor regulator), a 715
amino acid heparan sulfate glycoprotein encoded within the gene for the extracellular matrix protein
FREM1 [10]. FREM1 has a distinct function in embryogenesis and development, and is ubiquitously
expressed [11].

TILRR binds the cell membrane through a C-terminal lectin domain, associates with IL-1RI and
increases receptor expression and ligand binding. TILRR association potentiates recruitment of the
MyD88 adapter and receptor signal amplification, and enhances activation of NF-κB and inflammatory
genes [10]. We earlier confirmed expression and function of TILRR in mouse and human cells [10].

The current studies examine the presence of TILRR throughout evolution and demonstrate that
TILRR is a transcriptional variant of FREM1 whose transcriptional start site lies within the intronic
sequence of FREM1. TILRR is lacking in early species such as teleosts and invertebrates, being first
identifiable in amphibians. These findings highlight that although innate immunity is evolutionarily
ancient, refinements to the system have continued to arise until relatively recently, and that important
differences exist between model organisms used to study inflammation.

2. Results and Discussion 

To trace the evolutionary development of the IL-1RI co-receptor TILRR, we identified
the TILRR isoform of FREM1 within the genomes of multiple organisms and defined the source of the
TILRR peptide sequence within the nucleotide sequence of the FREM1 loci. Alignment of the peptide
sequences of human TILRR and FREM1 shows that they are identical from R17 of TILRR (R1481 of
FREM1). This TILRR/FREM1 consensus region is encoded by exons 25 onwards in the human
FREM1 gene.
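
The residue numbering implied by this alignment can be checked with a few lines of Python. This is only an arithmetic sketch: the anchor positions (R17 and R1481) and the two protein lengths (715 and 2,179 residues) are the values quoted in this paper, while the function name `tilrr_to_frem1` is a hypothetical helper introduced for illustration.

```python
# Anchor from the alignment: R17 of TILRR corresponds to R1481 of FREM1.
TILRR_ANCHOR = 17
FREM1_ANCHOR = 1481
OFFSET = FREM1_ANCHOR - TILRR_ANCHOR  # 1464 residues

TILRR_LENGTH = 715   # residues, as quoted in the Introduction
FREM1_LENGTH = 2179  # residues, Ensembl transcript ENST00000452036


def tilrr_to_frem1(position: int) -> int:
    """Map a TILRR residue number in the shared region to its FREM1 residue number."""
    if not TILRR_ANCHOR <= position <= TILRR_LENGTH:
        raise ValueError("only TILRR residues 17..715 are shared with FREM1")
    return position + OFFSET


print(tilrr_to_frem1(17))   # 1481
print(tilrr_to_frem1(715))  # 2179
```

That the last TILRR residue maps onto the last FREM1 residue (715 + 1464 = 2179) agrees with the statement that the two proteins are identical from R17 of TILRR to the C-terminus.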

To predict the location within the human FREM1 locus where TILRR transcripts initiate, we
analysed the Ensembl annotation of the human FREM1 transcript encoding 2,179 amino acids
(ENST00000452036). Studying the cDNA sequence of this transcript revealed that R1481 is encoded
across a splice junction, its codon being formed by ligation of the final two nucleotides of exon 24 to
the first nucleotide of exon 25 (Figure 1). A lack of homology between the 16 N-terminal TILRR residues
and any other part of FREM1 suggested that no early FREM1 coding exon is ligated to exon 25 to encode
TILRR. Therefore, we reasoned that the N-terminus (translational start site) of TILRR must reside
within a sequence of FREM1 not incorporated into the processed FREM1 mRNA. We hypothesized
that the N-terminus of TILRR could be located by examining the annotation of the FREM1 transcript
prior to the first exon common to both TILRR and FREM1 (exon 25). Translation in all three reading
frames of the intronic nucleotide sequence immediately preceding exon 25 produced the unique
16 TILRR N-terminal residues that we previously sequenced using MALDI-TOF [10]:
MVTQESMLKAALPLFT, followed by R17 and the remaining residues up to Q155. The N-terminus
of TILRR is thus produced by a series of consecutive nucleotides immediately prior to, and in frame with,
the third nucleotide of the R1481 codon of FREM1. During RNA processing, FREM1 mRNA
transcripts arise when exon 24 is ligated to a 3' splice acceptor site prior to the third nucleotide of the
R1481 codon, whereas exon 1 of TILRR commences within intron 23-24 and runs into exon 24
without the requirement for such an acceptor splice site (Figure 1B). This initiation of a novel
"orphan" gene from a non-coding sequence is a recently described mechanism, which in many cases
allows the organism to adapt to novel conditions [12-14].
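
The three-frame translation step described above can be illustrated with a short Python sketch. It is not the pipeline used in this study: it applies the standard genetic code to the mouse sequence shown in Figure 1B (chosen because that sequence is given in full), and the names `CODON_TABLE`, `MOUSE_REGION` and `translate` are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code in TCAG order (a textbook table, not taken from the paper).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Mouse sequence from Figure 1B, written codon by codon for readability.
MOUSE_REGION = (
    "atg ggg aca caa gag ccc atg ctg aag gct gcc ttg cct ctc ttt gcc "
    "agA TTT ACC ATC AGC"
).replace(" ", "")


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


frames = [translate(MOUSE_REGION, frame) for frame in range(3)]
for frame, peptide in enumerate(frames):
    print(frame, peptide)
# Frame 0 reads MGTQEPMLKAALPLFARFTIS: the 16 N-terminal residues, then R17.
```

Only the frame that is continuous with the shared exon yields the sequenced N-terminal peptide, which is the criterion used in the text to locate the TILRR start.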

Analysis of the mouse Frem1 locus (ENSMUSG00000059049) in the same manner as human
FREM1 revealed a similar splicing mechanism: the N-terminal 16 residues of mouse TILRR
(MGTQEPMLKAALPLFA, as we previously showed by peptide sequencing [10]) are encoded by an
intronic nucleotide sequence upstream of exon 25 of the Frem1 transcript. This is consistent with the
suggestion that TILRR and FREM1 mRNAs arise due to alternative transcriptional initiation of the
TILRR mRNA within intron 24-25 of the FREM1 gene.

Since analysis of both the human FREM1 and mouse Frem1 loci clearly identified the 5' TILRR
coding sequence within the intron preceding exon 25, we reasoned that examining this region in other
species would allow determination of whether each organism possesses an ortholog of TILRR.
Figure 1. (A) Schematic diagram of human and mouse FREM1 pre-mRNA. (B) DNA
sequence and peptide translation of human and mouse FREM1 in the region of the boundary
between exons 24 and 25 (upper panel) and of TILRR, showing genomic sequence and
translated peptide.
[Figure 1B, sequence panel; the H. sapiens row is not legible in this copy.]
M. musculus  atg ggg aca caa gag ccc atg ctg aag gct gcc ttg cct ctc ttt gcc agA TTT ACC ATC AGC
              M   G   T   Q   E   P   M   L   K   A   A   L   P   L   F   A   R   F   T   I   S

(A) Exons are numbered as displayed in the Ensembl transcript (ENST00000452036). The diagram
outlines the splicing events producing mature FREM1 mRNA (shown in grey) and the alternate
splice events within the intronic region between exons 24 and 25, which give rise to TILRR mRNA
(shown in black). As indicated by ENST00000380894, TILRR transcripts include a 5' UTR ending within
the intronic region between the annotated FREM1 exons 24 and 25. Following FREM1 exon 25, splicing
events for both FREM1 and TILRR are identical. * At the time of writing, it is unknown how far
the 5' UTR of TILRR extends upstream of exon 25. (B) Intron 24-25 and exon 25 of
FREM1 are also exons 1 and 2 of TILRR.



We therefore extended our investigation to the genomes of 37 tetrapod organisms to
identify the 5' end of the TILRR coding sequence within each Frem1 ortholog. We used the predicted
FREM1 cDNA transcript sequences to identify the region of each locus corresponding to the exon
containing alternative 3' splice acceptor sites, as in exon 25 of human FREM1. The preceding
nucleotides were translated in frame with the nucleotide sequence of the identified exon and the
resulting peptide sequence was aligned with the TILRR N-termini. For 33 tetrapod species, 16 consecutive
amino acids could be produced in frame with R17, suggesting that a strongly conserved TILRR
homolog exists in these organisms (Figure 2).
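
A position-by-position comparison of candidate N-termini can be sketched as follows. This is an illustrative Python example only, not the CLUSTALW/CLUSTALX alignment used in the study; it compares the human and mouse 16-residue N-terminal peptides quoted in the text, and the helper name `compare_peptides` is hypothetical.

```python
HUMAN_NTERM = "MVTQESMLKAALPLFT"  # human TILRR residues 1-16 (from the text)
MOUSE_NTERM = "MGTQEPMLKAALPLFA"  # mouse TILRR residues 1-16 (from the text)


def compare_peptides(a: str, b: str):
    """Return (fraction identical, list of (position, residue_a, residue_b) mismatches).

    Positions are 1-based. The peptides must have equal length, because this
    sketch performs an ungapped comparison rather than a true alignment.
    """
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    mismatches = [
        (pos, x, y) for pos, (x, y) in enumerate(zip(a, b), start=1) if x != y
    ]
    identity = (len(a) - len(mismatches)) / len(a)
    return identity, mismatches


identity, mismatches = compare_peptides(HUMAN_NTERM, MOUSE_NTERM)
print(f"{identity:.1%} identical")  # 81.2% identical (13 of 16 residues)
print(mismatches)                   # [(2, 'V', 'G'), (6, 'S', 'P'), (16, 'T', 'A')]
```
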
Although four tetrapod FREM1 loci (C. familiaris, D. ordii, T. syrichta and E. telfairi) could not
immediately be translated into the 5' TILRR residues, we found that all four possess a single
nucleotide alteration relative to the sequence encoding the consensus TILRR peptide, but that with this
exception the consensus N-terminal TILRR sequence was preserved, indicating these species are likely to
possess the TILRR isoform of FREM1 (Figure 3). Alternatively, in these species TILRR may constitute a
pseudogene, which produces a non-functional protein, although it is highly likely that all tetrapod
homologues arose from a common ancestor. Future studies are needed to assess the significance of
these mutations in signal amplification of the TIR domain.

Since we had identified TILRR orthologues in all tetrapod species studied, we next analysed teleost
FREM1 homologs using annotations of all identified teleost FREM1 orthologs in the Ensembl
database. All teleost species possessed at least one FREM1 orthologue. However, no teleost
FREM1 ortholog contained a conserved 5' coding sequence indicative of a tilrr transcript
(Figure 4).



Figure 2. Comparison of TILRR N-terminal sequence.
The 17-residue TILRR N-terminal sequences as identified within FREM1 gene records including
human and mouse TILRR, with consensus sequence.
Figure 3. Identification of single base mutations responsible for non-homologous sequence
in C. familiaris, D. ordii, T. syrichta and E. telfairi.
For C. familiaris, D. ordii, T. syrichta and E. telfairi, analysis of the nucleotide sequences and
corresponding translations revealed that single base mutations (highlighted in black) are responsible
for non-homologous N-terminal sequences. H. sapiens TILRR is also shown for comparison.



Figure 4. FREM1 orthologs in fish lack the TILRR N-terminus.
Diagram of FREM1 orthologues in teleost fish, demonstrating the lack of potential TILRR N-termini
like those identified in other organisms and suggesting that TILRR N-termini do not exist within
fish frem1.

We next examined invertebrate species for orthologues of FREM1/TILRR. Using BLASTP, we
were unable to locate orthologues of either the consensus regions shared between Frem1 and Tilrr or
the Tilrr 5' sequence in non-vertebrates (Drosophila melanogaster, Ciona intestinalis, Caenorhabditis
elegans). We therefore concluded that invertebrates possess neither FREM1 nor a TILRR orthologue.
Thus, FREM1 arose after the evolution of vertebrates, but TILRR only becomes detectable after the
divergence of the teleosts.

To investigate possible mechanisms for the absence of TILRR in teleosts, we examined the
exon/exon boundary sequences of FREM1 in human, mouse, Xenopus and four teleosts (Danio rerio
[zebrafish], Oryzias latipes [medaka], Takifugu rubripes [fugu] and Gasterosteus aculeatus
[stickleback]) and the corresponding intron/exon sequences that encode TILRR in tetrapods but not
in teleosts (Figure 5). We found that whereas the exonic, and particularly the translated, sequence of
FREM1 was reasonably well conserved even between mammals and teleosts, the
intronic region encoding TILRR in tetrapods showed marked divergence. Although all introns end
with the major spliceosomal AG consensus splice site, the actual intronic sequences diverged markedly
between teleosts and mammals. Although N-termini often vary greatly in length and sequence between
homologues, in zebrafish, medaka and stickleback translation of the ORF of the contiguous intronic
sequence preceding the shared frem1 exon leads to a stop codon between the shared exon and the
earliest possible methionine start codon (Figure 5). In fugu there is a methionine in the N-terminal
sequence that could represent a TILRR start codon, although there is no equivalent of this in the
human sequence (Figure 5). Given the otherwise high conservation of the frem1 gene between teleosts,
it seems unlikely that fugu possesses a TILRR homologue when the other three teleosts do not. It seems
likely that TILRR arose through a major alteration of intronic sequence, rather than a more subtle
perturbation that generated novel intronic transcription factor binding sites.
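
The test applied to the teleost introns, namely whether a methionine codon can be reached upstream of the shared exon before a stop codon intervenes, can be sketched in Python. This is an illustration under stated assumptions, not the method used in the study: the function name `upstream_start_codons` is hypothetical, the positive example is the 48-nucleotide mouse sequence encoding the TILRR N-terminus (Figure 1B), and the blocked example is an invented sequence, not real teleost data.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}

# The 48 nucleotides encoding the 16 N-terminal residues of mouse TILRR (Figure 1B).
MOUSE_UPSTREAM = (
    "atg ggg aca caa gag ccc atg ctg aag gct gcc ttg cct ctc ttt gcc"
).replace(" ", "")

# Hypothetical sequence (NOT real data): a stop codon lies between the
# exon boundary and the only upstream ATG.
HYPOTHETICAL_BLOCKED = "ATG CAG TGA CCC TTT GCC".replace(" ", "")


def upstream_start_codons(in_frame_dna: str) -> List[int]:
    """Walk upstream from the 3' end, codon by codon, until a stop codon.

    The input must end on a codon boundary of the exon's reading frame.
    Returns the 0-based offsets of every ATG met before the first stop,
    nearest to the exon first. An empty list means that a stop codon
    intervenes before any possible methionine start.
    """
    dna = in_frame_dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    starts = []
    for i in range(len(dna) - 3, -1, -3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            starts.append(i)
    return starts


print(upstream_start_codons(MOUSE_UPSTREAM))        # [18, 0]
print(upstream_start_codons(HYPOTHETICAL_BLOCKED))  # []
```

In the mouse sequence the most upstream ATG (offset 0) is the start of the sequenced N-terminal peptide, whereas in the invented blocked sequence no start codon is reachable, which is the pattern reported here for zebrafish, medaka and stickleback.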

We therefore conclude that the TILRR isoform of FREM1 is present only in tetrapod organisms,
which leaves two possibilities for the origin of TILRR. It may have originated in an ancestor
common to both teleosts and tetrapods that arose after the invertebrates, and then been lost from
FREM1 paralogs prior to the evolution of modern teleosts. Alternatively (the explanation we favour),
TILRR may have originated following the divergence of a common ancestor into the tetrapod lineage,
hence its first detection within amphibian FREM1 (Figure 6).

Either possibility suggests that, in contrast to a majority of IL-1RI complex components, which are
present in primitive species such as D. rerio [7,9], TILRR is not involved in IL-1RI-controlled
responses to pathogenic invasion in ancestral or modern-day teleosts.

The work in this study places the maturation of the IL-1 receptor complex within the timeframe
between the divergent evolution of teleost fish and tetrapod amphibians some 360-450 million years
ago [15]. The conservation of TILRR within the genomes of tetrapod organisms likely represents
refinement of IL-1 signaling over the course of vertebrate evolution, to allow increased sensitivity of
system control through ligand concentration and receptor levels.

The finding that TILRR does not exist in any teleost studied suggests that distinct components of the
vertebrate IL-1RI complex may have evolved at different stages of the evolutionary tree, perhaps
reflecting functions related to TILRR-controlled environmental sensing and attachment. The lack of
TILRR expression in primitive species such as D. rerio, in addition to the absence of Syndecan 1 and
the TLR4 co-receptors [8,9], also supports the notion that the regulatory mechanisms of inflammatory
signaling in mammals are not all equivalent to those of lower vertebrates. Common features of
the co-receptors of these systems are related to ligand/receptor interactions and receptor sensitivity to
ligand, allowing for increasing variability and specificity over a range of ligand and receptor levels.
Similarly, recently identified TILRR mutants demonstrate selective regulation of distinct cellular
responses related to inflammation and cell survival, thus contributing refined control and increased
specificity [16].

Figure 5. Comparative alignment of coding sequence of FREM1 and TILRR in mammals,
amphibians and teleosts.
Upper panel: Alignments of the exon/exon sequence coding for FREM1 in human, mouse,
Xenopus, medaka, zebrafish, stickleback and fugu (red shows sequence for exon 24, blue exon 25)
with translated protein sequence. Lower panel: Alignments of the intron/exon sequence coding for
TILRR in human, mouse, Xenopus, medaka, zebrafish, stickleback and fugu (green shows intron
sequence for the intron immediately preceding exon 25, blue exon 25) with translated protein sequence
(for human, mouse and Xenopus) and hypothetical translated protein sequence for teleosts to show
the protein sequence if the intronic sequence were included in a teleost TILRR orthologue in the
same manner as in mammals or amphibians. Note the presence of premature stop codons (*) in three
of the four teleost sequences (a stop codon arises in the zebrafish intronic sequence upstream of
that shown), with only fugu possessing a methionine downstream of the stop codon that could
represent a putative start ATG.
Figure 6. Evolutionary development of FREM1 and presence of putative TILRR sequence.



H. sapiens        ENST00000452036
P. troglodytes    ENSPTRT00000056681
G. gorilla        ENSGGOT00000014783
P. pygmaeus       ENSPPYT00000022394
M. mulatta        ENSMMUT00000046207
C. jacchus        ENSCJAT00000023873
C. familiaris     ENSCAFT00000002397
E. caballus       ENSECAT00000016814
B. taurus         ENSBTAT00000026827
O. cuniculus      ENSOCUT00000010391
M. musculus       ENSMUST00000071708
R. norvegicus     ENSRNOT00000058615
D. ordii          ENSDORT00000014455
M. domestica      ENSMODT00000019082
G. gallus         ENSGALT00000008716
A. carolinensis   ENSACAT00000017859
G. aculeatus      ENSGACT00000004135
X. tropicalis     ENSXETT00000019485
C. savignyi       ENSCSAVT00000002226

Slanted cladogram displaying the evolutionary divergence of FREM1 in organisms for which
complete Frem1 peptide sequences could be identified within Ensembl. FREM1 protein sequences
were obtained from the Ensembl transcripts as listed for each organism as orthologs of human
FREM1. Representation of multiple Frem1 transcripts indicates the presence of a FREM1 paralog.
3. Experimental Section

3.1. Obtaining Predicted Frem1 Peptide Sequences

GenBank [17] was used as a source for TILRR and FREM1 peptide sequences. GeneIDs and
NCBI [18] Reference sequences for the peptides used to compare with predicted transcripts are
listed below (Table 1).



Table 1. Accession numbers of FREM1 sequences.

Organism      GeneID      Protein Name                 NCBI Reference sequence
H. sapiens    158326      FREM1 isoform 2 (TILRR)      NP_001171175
                          FREM1 isoform 1 precursor    NP_659403
M. musculus   329872      FREM1 precursor              NP_808531
D. rerio      100216326   Frem1a                       NP_001177237
              557221      Frem1b                       NP_001131130

Ensembl was used to locate predicted FREM1 gene loci, identified as orthologs of the Ensembl
annotation of human FREM1 (ENSG00000164946) or of zebrafish frem1b (ENSDARG00000062402) [19].

Predicted Ensembl FREM1 ortholog transcripts are shown in Supplementary Figure 1.

Previous studies of D. rerio Frem1 used probes deduced by sequence analysis to analyse expression
of the two orthologs, Frem1a and Frem1b, which we aligned to the predicted protein sequences as
listed in Ensembl. By this method, we found that D. rerio Frem1a had a greater similarity to the
ENSDART00000090165 transcript annotation in the Zv7 genome than to the more recent Zv8
genome annotation. Conversely, the frem1b transcript had perfect similarity to ENSDART00000090217
on Zv8. For these reasons we used the Zv7 annotation of ENSDART00000090165 as a basis for the
exon structure of frem1a, and the Zv8 annotation of ENSDART00000090217 as a basis for the exon
structure of frem1b.

3.2. Sequence Alignments 

All peptide and nucleotide sequence alignments were performed using CLUSTALW and
CLUSTALX. A slanted cladogram of FREM1 peptide sequences was constructed using the UPGMA
algorithm in CLUSTALX and displayed using TreeViewX.
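
For readers unfamiliar with UPGMA, the clustering principle can be sketched in a few lines of Python. This is a minimal illustration of the general algorithm, not the CLUSTALX implementation used here: the function name `upgma` is hypothetical, the pairwise distances are invented toy values rather than distances computed from FREM1 sequences, and branch lengths are omitted so that only the topology is returned.

```python
def upgma(names, distances):
    """Cluster taxa by UPGMA and return the topology as a Newick string.

    `distances` maps a pair of taxon names (a tuple) to their distance.
    At each step the two closest clusters are merged, and the distance from
    the merged cluster to every other cluster is the size-weighted mean of
    the two previous distances.
    """
    sizes = {name: 1 for name in names}
    d = {tuple(sorted(pair)): value for pair, value in distances.items()}
    while len(sizes) > 1:
        a, b = min(d, key=d.get)            # closest pair of clusters
        merged = f"({a},{b})"
        merged_size = sizes[a] + sizes[b]
        for c in sizes:
            if c in (a, b):
                continue
            d_ac = d.pop(tuple(sorted((a, c))))
            d_bc = d.pop(tuple(sorted((b, c))))
            d[tuple(sorted((merged, c)))] = (
                sizes[a] * d_ac + sizes[b] * d_bc
            ) / merged_size
        del d[(a, b)]
        del sizes[a], sizes[b]
        sizes[merged] = merged_size
    return next(iter(sizes)) + ";"


# Toy distances (hypothetical values chosen for illustration only).
TAXA = ["human", "mouse", "xenopus", "zebrafish"]
TOY_DISTANCES = {
    ("human", "mouse"): 0.2,
    ("human", "xenopus"): 0.5,
    ("mouse", "xenopus"): 0.5,
    ("human", "zebrafish"): 0.8,
    ("mouse", "zebrafish"): 0.8,
    ("xenopus", "zebrafish"): 0.8,
}

print(upgma(TAXA, TOY_DISTANCES))  # (((human,mouse),xenopus),zebrafish);
```

UPGMA assumes a constant rate of change across lineages, so a tree built this way is best read as a summary of overall sequence similarity.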

4. Conclusions 

Our data show that TILRR is a recent addition to the IL-1RI signaling system. These findings, and
those of others looking at evolutionary development of regulatory pathways, are consistent with a role
for co-receptors in modulating response control in higher organisms. This opens interesting
possibilities for investigating development of regulatory intermediates and delineating mechanisms
underlying the increased sensitivity and complexity characteristic of maturation of biological systems
during evolution.